evanlinjin
Member

Description

Implements a single-level skiplist for CheckPoint to improve traversal performance from O(n) to O(√n).

Notes to the reviewers

The skiplist uses checkpoint indices (not block heights) with a fixed interval of 100. This ensures consistent skip pointer distribution even with sparse checkpoint chains.

Key implementation details:

  • Skip pointers are set every 100 checkpoints based on index
  • The insert() method rebuilds indices to maintain skiplist invariants
  • All existing APIs remain unchanged
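
To make the index-based placement concrete, here is a minimal sketch of a node that carries both a skip pointer and an index. The names `Node`, `SKIP_INTERVAL`, and `push` are hypothetical stand-ins chosen for illustration; this is not the PR's `CheckPoint` code, and the way the skip target is located here (walking back from the tip) is an assumption.

```rust
use std::sync::Arc;

/// Assumed interval, mirroring the fixed interval of 100 described above.
const SKIP_INTERVAL: u32 = 100;

/// Hypothetical stand-in for a checkpoint node (not the actual `CheckPoint` type).
struct Node {
    /// Block height; may be sparse, so it is not used for skip placement.
    height: u32,
    /// Position in the chain, counted from the base node (index 0).
    index: u32,
    /// Link to the previous (lower) node.
    prev: Option<Arc<Node>>,
    /// Link to the node `SKIP_INTERVAL` positions back; set only when
    /// `index` is a non-zero multiple of `SKIP_INTERVAL`.
    skip: Option<Arc<Node>>,
}

/// Pushes a new node on top of `tip` and assigns its index and skip pointer.
fn push(tip: Option<Arc<Node>>, height: u32) -> Arc<Node> {
    let index = tip.as_ref().map_or(0, |t| t.index + 1);
    let needs_skip_pointer = index > 0 && index % SKIP_INTERVAL == 0;

    let skip = if needs_skip_pointer {
        // The tip sits at `index - 1`, so the node at `index - SKIP_INTERVAL`
        // is `SKIP_INTERVAL - 1` steps behind the tip.
        let mut target = tip.clone();
        for _ in 1..SKIP_INTERVAL {
            target = target.and_then(|n| n.prev.clone());
        }
        target
    } else {
        None
    };

    Arc::new(Node { height, index, prev: tip, skip })
}
```

Because placement depends on `index` rather than `height`, a chain with heights 0, 10, 20, ... still gets a skip pointer on every 100th node, which is the property the note above relies on.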

Changelog notice

Added

  • Skiplist support for CheckPoint with skip pointer and index fields
  • O(√n) traversal for get(), floor_at(), and range() methods
  • Performance benchmarks demonstrating ~265x speedup for deep searches in 10k checkpoint chains

Checklists

All Submissions:

New Features:

  • I've added tests for the new feature
  • I've added docs for the new feature

🤖 Generated with Claude Code

@evanlinjin evanlinjin marked this pull request as draft September 25, 2025 08:08
@evanlinjin
Member Author

Guys, this is purely done by Claude. I haven't reviewed it yet.

@evanlinjin evanlinjin force-pushed the feature/skiplist branch 2 times, most recently from 153c401 to 098c076 Compare September 25, 2025 09:25
@evanlinjin evanlinjin moved this to In Progress in BDK Chain Sep 25, 2025
@evanlinjin
Member Author

Performance Benchmark Comparison

Benchmarks comparing the old O(n) implementation vs new skiplist O(√n) implementation for a 10,000 checkpoint chain:

🎯 Key Results

| Operation | Old Implementation | Skiplist Implementation | Speedup |
| --- | --- | --- | --- |
| get(100) - near start | 98.270 μs | 421 ns | 233x faster |
| get(9000) - near end | 9.668 μs | 44 ns | 220x faster |
| linear_traversal(100) | 56.965 μs | 110.66 μs | 0.5x (expected*) |

📊 Detailed Benchmarks

Finding checkpoint at position 100 (from 10k chain):

  • Old: 98.270 μs - Linear search from tip
  • New: 421 ns - Skip pointers jump directly to target
  • Improvement: 233x faster 🚀

Finding checkpoint at position 9000 (from 10k chain):

  • Old: 9.668 μs - Linear search through 1000 nodes
  • New: 44 ns - Skip pointers minimize traversal
  • Improvement: 220x faster 🚀

* Note: The linear_traversal benchmark shows the new implementation running at roughly half the speed of the old one (110.66 μs vs 56.965 μs): it performs the same linear traversal but carries additional overhead from the skip/index fields. The real performance gains come from using the skiplist-aware methods like get(), floor_at(), and range().

Summary

The skiplist implementation provides large performance improvements for checkpoint lookups, especially for deep searches in long chains. The 200x+ speedups measured on the 10,000-checkpoint benchmark chain are consistent with the expected O(√n) behavior.
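
As a back-of-the-envelope check on the O(√n) figure, the hop count for a single lookup can be estimated from the chain length and the skip interval. The function below is a hypothetical model, not code from the PR. It assumes a lookup walks linearly for at most one interval to reach a node with a skip pointer, follows skip pointers, then walks linearly for at most one more interval.

```rust
/// Rough upper bound on pointer hops for one lookup in a chain of `n` nodes
/// with a skip pointer every `interval` nodes.
/// This is an assumed cost model, not a measured or documented value.
fn estimated_max_hops(n: u64, interval: u64) -> u64 {
    // Linear steps before the first skip and after the last skip,
    // plus one hop per full interval covered by skip pointers.
    2 * (interval - 1) + n / interval
}
```

Under this model, `estimated_max_hops(10_000, 100)` is 298, which is within a small constant factor of √10,000 = 100. With the interval fixed at 100 the `n / interval` term grows linearly, so the bound tracks √n most closely when the chain length is near the square of the interval.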

@evanlinjin evanlinjin self-assigned this Sep 25, 2025
evanlinjin and others added 9 commits October 17, 2025 05:15
Apply the same two-phase optimization from get() to range():
- Phase 1: Use skip pointers exclusively to jump close to target
- Phase 2: Linear traversal for precise positioning

Additional improvements:
- Extract is_above_bound helper as local closure
- Add comprehensive edge case tests
- Improve benchmark coverage for different access patterns

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Replace the manual traversal logic in floor_at() with a simple delegation to range().
This eliminates code duplication and reuses all the optimizations from
the range() method.

The new implementation is just:
  self.range(..=height).next()

Performance impact:
- Significant improvement for smaller chains (85% faster)
- Minor regression for very large chains due to iterator setup
- Overall worth it for the massive code simplification
Remove unnecessary push_with_index() helper and restore the clean
implementation from master that uses iter::once().chain() with extend().

The complex manual index management was not needed - extend() correctly
handles index assignment and skip pointer calculation automatically.

Removes 60+ lines of unnecessary code while maintaining all functionality
and performance.
Add skip pointers and index tracking to CheckPoint structure with
CHECKPOINT_SKIP_INTERVAL=100. Update get(), floor_at(), range(),
insert() and push() methods to leverage skip pointers.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Test index tracking, skip pointer placement, get/floor_at/range
performance, and insert operation with index maintenance.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Demonstrate ~265x speedup for deep searches in 10k checkpoint chains.
Linear traversal: ~108μs vs skiplist get: ~407ns.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Split skip pointer and linear traversal into separate loops for better
performance. Benchmarks show 99% improvement for middle-range queries
and 30% improvement for small chains.
- Use early return pattern for readability
- Add `needs_skip_pointer` variable for clarity
- Simplify traversal to straightforward step counting

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@evanlinjin evanlinjin force-pushed the feature/skiplist branch 2 times, most recently from 5544fee to 4b9ccd1 Compare October 17, 2025 10:36
@evanlinjin evanlinjin marked this pull request as ready for review October 17, 2025 12:14
@evanlinjin
Member Author

evanlinjin commented Oct 17, 2025

Guys, this is purely done by Claude. I haven't reviewed it yet.

It's now fully reviewed by myself! Made many simplifications.

Let's merge #2055 and rebase this on top of that!

@evanlinjin
Member Author

Skiplist Performance Update

After the optimizations, here are the updated benchmark results:

get() Performance

| Benchmark | Time | Notes |
| --- | --- | --- |
| get_100_near_start | 475.89 ns | Get checkpoint near start of 100-item chain |
| get_1000_middle | 31.07 ns | Get checkpoint in middle of 1000-item chain |
| get_10000_near_end | 57.12 ns | Get checkpoint near end of 10000-item chain |
| get_10000_near_start | 535.37 ns | Get checkpoint near start of 10000-item chain |

floor_at() Performance

| Benchmark | Time | Notes |
| --- | --- | --- |
| floor_at_1000 | 286.33 ns | Floor at height 750 in 1000-item chain |
| floor_at_10000 | 673.27 ns | Floor at height 7500 in 10000-item chain |

range() Performance

| Benchmark | Time | Notes |
| --- | --- | --- |
| range_1000_middle_10pct | 1.67 µs | Range 450..=550 in 1000-item chain |
| range_10000_large_50pct | 97.59 µs | Range 2500..=7500 in 10000-item chain |
| range_10000_from_start | 3.11 µs | Range ..=100 in 10000-item chain |
| range_10000_near_tip | 1.21 µs | Range 9900.. in 10000-item chain |
| range_single_element | 942.21 ns | Range 5000..=5000 in 10000-item chain |

Traversal Comparison

| Benchmark | Time | Notes |
| --- | --- | --- |
| linear_traversal_10000 | 140.90 µs | Linear search to height 100 in 10000-item chain |
| skiplist_get_10000 | 539.80 ns | Skip-enhanced search to height 100 in 10000-item chain |

Speedup: 261x faster with skip pointers!

Summary

The skip list implementation successfully achieves O(√n) time complexity for search operations. Key improvements from our optimizations:

  1. Cleaner two-phase traversal in get() and range()
  2. Simplified floor_at() from 33 lines to 1 line
  3. Restored elegant insert() implementation (removed 60+ lines)
  4. Refactored push() with clearer skip pointer logic

All tests pass and the implementation is now both performant and maintainable.
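
For readers who want to see the shape of a two-phase lookup, here is a minimal sketch over an arena of nodes. The `Node` layout, `build_chain`, and the hop counter are hypothetical and exist only for illustration. The sketch also assumes that phase 1 steps linearly until it reaches a node that carries a skip pointer, which may differ from how the PR's get() handles a tip without one.

```rust
/// Assumed skip interval, matching the value used in this PR.
const SKIP_INTERVAL: usize = 100;

/// Hypothetical arena node; `prev` and `skip` are positions in the arena.
struct Node {
    height: u32,
    prev: Option<usize>,
    skip: Option<usize>,
}

/// Builds a chain from ascending heights, placing skip pointers by index.
fn build_chain(heights: &[u32]) -> Vec<Node> {
    heights
        .iter()
        .enumerate()
        .map(|(index, &height)| Node {
            height,
            prev: index.checked_sub(1),
            skip: (index > 0 && index % SKIP_INTERVAL == 0)
                .then(|| index - SKIP_INTERVAL),
        })
        .collect()
}

/// Searches from the tip for `height`.
/// Returns the arena position (if found) and the number of pointer hops taken.
fn get(chain: &[Node], height: u32) -> (Option<usize>, usize) {
    let Some(mut cur) = chain.len().checked_sub(1) else {
        return (None, 0);
    };
    let mut hops = 0;

    // Phase 1: coarse positioning. Follow a skip pointer whenever it does not
    // overshoot the target; otherwise step back until one is available.
    while chain[cur].height > height {
        match (chain[cur].skip, chain[cur].prev) {
            (Some(skip), _) if chain[skip].height >= height => cur = skip,
            (Some(_), _) => break, // the next skip would jump past the target
            (None, Some(prev)) if chain[prev].height >= height => cur = prev,
            _ => break,
        }
        hops += 1;
    }

    // Phase 2: linear traversal for precise positioning.
    while chain[cur].height > height {
        match chain[cur].prev {
            Some(prev) => {
                cur = prev;
                hops += 1;
            }
            None => break,
        }
    }

    ((chain[cur].height == height).then_some(cur), hops)
}
```

On a dense 10,000-node chain built with `build_chain`, looking up height 100 in this sketch takes 197 hops (99 linear steps to reach index 9900, then 98 skips), compared with 9,899 hops for a purely linear walk from the tip.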

